What is Power? + Calculating Power.
We haven’t yet answered… How much data should we collect?
| | \(H_0\) is True | \(H_0\) is False |
|---|---|---|
| Reject \(H_0\) | Type I ( ___________) | No Error ( ___________) |
| Fail to Reject \(H_0\) | No Error ( ___________) | Type II ( ___________) |
Power \((1 - \beta)\)
The power of a test is the probability of correctly rejecting the null hypothesis if a particular alternative scenario is true.
\(\text{Power} = 1-\beta = P(\text{reject } H_0 \mid \text{a meaningful difference does in fact exist})\)
Power depends on:
A key idea: You cannot compute power for “something differs”; you need a specific alternative.
Suppose we are preparing to conduct a study for:
\(\delta\) = difference worth detecting
Information Needed to Calculate the Power of a Test
\[F = \dfrac{\text{MSTrt}}{\text{MSE}}\]
If \(H_0\) is true, \(F \sim F(\text{df}_1,\text{df}_2)\) (Central \(F\))
If a specific \(H_A\) is true \(F \sim F(\text{df}_1,\text{df}_2,\lambda)\) (Noncentral \(F\))
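Putting the two distributions together, power can be written as a tail probability of the noncentral \(F\) beyond the central \(F\) critical value. The symbol \(F_{\alpha;\,\text{df}_1,\text{df}_2}\) for the upper-\(\alpha\) critical value is notation introduced here, not taken from the notes:

\[\text{Power} = 1-\beta = P\big(F > F_{\alpha;\,\text{df}_1,\text{df}_2}\big), \qquad F \sim F(\text{df}_1,\text{df}_2,\lambda)\]

When \(\lambda = 0\) the noncentral \(F\) reduces to the central \(F\), so this probability is just \(\alpha\).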
If we have \(t\) treatments and \(r\) reps each:
Example 3.1: Skeleton ANOVA
| SV | DF |
|---|---|
| Shoe Type | (4 - 1) = 3 |
| Runner(Shoe Type) → error | (3 - 1)(4) = 8 |
| Total | 12 runners - 1 = 11 |
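The degrees of freedom in the skeleton ANOVA follow directly from \(t\) and \(r\). A minimal sketch, assuming a balanced one-way layout; the function name `skeleton_anova_df` is made up for illustration:

```python
def skeleton_anova_df(t, r):
    """Degrees of freedom for a balanced one-way layout: t treatments, r reps each."""
    return {
        "treatment": t - 1,    # df_1 for the F test
        "error": t * (r - 1),  # df_2 for the F test
        "total": t * r - 1,
    }


# Example 3.1: 4 shoe types, 3 runners per shoe type
df = skeleton_anova_df(t=4, r=3)
print(df)
```

The treatment and error rows always add up to the total, which is a quick check on any skeleton ANOVA.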
\[\lambda = \frac{\sum_{i=1}^{t}(\mu_i-\bar{\mu})^2}{\sigma^2 / r}\]
Sanity check
If all means are equal, then \(\lambda = 0\).
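The sanity check is easy to confirm numerically. The sketch below computes \(\lambda\) from a list of treatment means; the means and variance used here are hypothetical values chosen only to make the arithmetic easy, and `noncentrality` is an illustrative name:

```python
def noncentrality(means, sigma2, r):
    """lambda = sum((mu_i - mu_bar)^2) / (sigma2 / r) = r * sum(...) / sigma2."""
    mu_bar = sum(means) / len(means)
    return r * sum((m - mu_bar) ** 2 for m in means) / sigma2


# Hypothetical numbers, not from the running example
lam_null = noncentrality([50.0, 50.0, 50.0, 50.0], sigma2=4.0, r=3)  # all means equal
lam_alt = noncentrality([48.0, 50.0, 50.0, 52.0], sigma2=4.0, r=3)   # two means differ
print(lam_null, lam_alt)
```

Larger differences among the means, more replications, or a smaller \(\sigma^2\) all push \(\lambda\) up.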
Recall, the things we need to calculate power are:
1. A specific alternative (e.g., \(\delta\))
An ultra-marathon running expert tells us that a difference of 10 seconds between the runners’ lap times would be of practical importance. Thus, \(\delta = 10\).
2. Replications per treatment (\(r\))
We are conducting a study with \(t = 4\) and \(r = 3.\)
3. Variability estimate (\(\sigma^2\))
From an analysis of a pilot study (from Mod 2: CRD Notes), we found \(\hat\sigma^2 = 17.46\).
Pilot Studies
Pilot studies can be super helpful for estimating the experimental error variance \((\hat\sigma^2)\) and differences in group means, both key pieces of a power analysis. Running a small version of your experiment gives you data to calculate the mean square error and get a sense of the effect size. This makes it easier to plan the full study and ensure your design has enough power to detect meaningful differences.
4. Significance level (\(\alpha\))
We get to set this at \(\alpha = 0.05.\)
Consider the central F-distribution with these parameters:
Recall that power is the probability of rejecting the null hypothesis when a specific alternative is true. So let’s focus on the region where we’ll reject the null hypothesis (the Rejection Region).
Now we must incorporate the part about the specific alternative being true. If our particular alternative is true, then the F-statistic actually follows a noncentral F-distribution with the same degrees of freedom and noncentrality parameter \(\lambda\). This is shown on the same plot as the central F-distribution below:
The power for this test is about ________. That is, there is only a 51% chance of correctly rejecting the null hypothesis under these conditions. Do you think this is a good test?
For most experiments, we would like the power of the test to be at least __________.
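One way to check a power figure like this without any distribution theory is to simulate the experiment many times and count how often \(F\) lands in the rejection region. The sketch below assumes the equally spaced arrangement of four means spanning \(\delta = 10\), normally distributed errors with \(\hat\sigma^2 = 17.46\), and the tabled 5% critical value of \(F(3, 8)\), about 4.066. The function name, seed, and number of simulations are arbitrary choices:

```python
import random


def simulate_power(means, sigma, r, f_crit, n_sims=20000, seed=1):
    """Monte Carlo estimate of P(F > f_crit) for a balanced one-way ANOVA."""
    rng = random.Random(seed)
    t = len(means)
    rejections = 0
    for _ in range(n_sims):
        groups = [[rng.gauss(mu, sigma) for _ in range(r)] for mu in means]
        group_means = [sum(g) / r for g in groups]
        grand_mean = sum(group_means) / t
        ms_trt = r * sum((gm - grand_mean) ** 2 for gm in group_means) / (t - 1)
        ss_err = sum((y - gm) ** 2 for g, gm in zip(groups, group_means) for y in g)
        ms_err = ss_err / (t * (r - 1))
        if ms_trt / ms_err > f_crit:
            rejections += 1
    return rejections / n_sims


F_CRIT_3_8 = 4.066  # tabled upper 5% point of the central F(3, 8)
sigma = 17.46 ** 0.5

# Assumed alternative: four equally spaced means with max - min = 10
est_power = simulate_power([-5.0, -5.0 / 3, 5.0 / 3, 5.0], sigma, r=3, f_crit=F_CRIT_3_8)
# With all means equal, the rejection rate should be close to alpha = 0.05
est_alpha = simulate_power([0.0, 0.0, 0.0, 0.0], sigma, r=3, f_crit=F_CRIT_3_8)
print(round(est_power, 2), round(est_alpha, 2))
```

The simulated rejection rate under the alternative should land near the power quoted above, up to Monte Carlo error.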
Even with the same maximum difference \(\delta\), power depends on how the treatment means are arranged:
| Mean pattern | Power (approx.) |
|---|---|
| Maximum power | 0.78 |
| Minimum power | 0.47 |
| Equally spaced | 0.51 |
| All but one equal | 0.65 |
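The table's ordering can be traced back to \(\lambda\). The notes do not list the actual means behind each row, so the arrangements below are assumptions: each has \(t = 4\) means with max minus min equal to \(\delta = 10\), and they were chosen to match the row labels. The sketch uses \(r = 3\) and \(\hat\sigma^2 = 17.46\) from the running example:

```python
delta, sigma2, r = 10.0, 17.46, 3
h = delta / 2

# Assumed arrangements (not given in the notes); each spans exactly delta
patterns = {
    "Maximum power": [-h, -h, h, h],            # half the means at each extreme
    "Minimum power": [-h, 0.0, 0.0, h],         # two extremes, the rest in the middle
    "Equally spaced": [-h, -h / 3, h / 3, h],
    "All but one equal": [0.0, 0.0, 0.0, delta],
}


def noncentrality(means, sigma2, r):
    """lambda = r * sum((mu_i - mu_bar)^2) / sigma2."""
    mu_bar = sum(means) / len(means)
    return r * sum((m - mu_bar) ** 2 for m in means) / sigma2


lambdas = {name: noncentrality(m, sigma2, r) for name, m in patterns.items()}
for name, lam in lambdas.items():
    print(f"{name:18s} lambda = {lam:.2f}")
```

Under these assumed arrangements, the patterns rank by \(\lambda\) in the same order as they rank by power in the table, which is why planning for the minimum-power pattern is the conservative choice.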
DOE > Sample Size Explorers > Power > Power for ANOVA
Minimum Power = Maximum Difference > Worst Case
Fix \(\alpha\), \(\sigma^2\), \(\delta\), power
Solve for \(r\)
Fix \(\alpha\), \(\sigma^2\), \(r\), power
Solve for \(\delta\)
What is the power of my test?
Balanced one-way analysis of variance power calculation
k = 4
n = 2
delta = 10
sigma = 4.178516
effect.size = 0.8461218
sig.level = 0.05
power = 0.2254578
NOTE: n is number in each group, total sample = 8
power = 0.225457781758005
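The `sigma` and `effect.size` lines in this output can be reproduced by hand. The notes do not say how `effect.size` was computed; the sketch below assumes it is Cohen's \(f\) (the standard deviation of the treatment means divided by \(\sigma\)) under the minimum-power pattern, where two means sit \(\delta\) apart and the others are in the middle. That assumption reproduces the printed value:

```python
import math

delta, sigma2, t = 10.0, 17.46, 4
sigma = math.sqrt(sigma2)

# Assumed minimum-power pattern: two means delta apart, the rest at the centre
means = [-delta / 2] + [0.0] * (t - 2) + [delta / 2]
mu_bar = sum(means) / t

# Cohen's f: SD of the treatment means (dividing by t) over sigma
effect_size = math.sqrt(sum((m - mu_bar) ** 2 for m in means) / t) / sigma
print(round(sigma, 6), round(effect_size, 7))
```

For this pattern the expression simplifies to \(f = \delta / (\sigma\sqrt{2t})\).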
What is the sample size I need?
Balanced one-way analysis of variance sample size adjustment
k = 4
sig.level = 0.05
power = 0.8
n = 5
NOTE: n is number in each group, total sample = 20
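Both outputs above can be approximated from first principles using only the standard library. This is a sketch, not the code that produced the output: it assumes the minimum-power pattern (so \(\lambda = r\,\delta^2 / (2\sigma^2)\)), evaluates the noncentral \(F\) as a Poisson-weighted sum of central beta CDFs, and finds the critical value by bisection. All function names are illustrative.

```python
import math


def _betacf(a, b, x):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 500):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for num in (even, odd):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < 1e-14:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def ncf_cdf(x, df1, df2, lam, terms=400):
    """Noncentral F CDF as a Poisson(lam / 2) mixture; truncated after `terms` terms."""
    y = df1 * x / (df1 * x + df2)
    if lam == 0.0:
        return betainc(df1 / 2, df2 / 2, y)  # central F
    half = lam / 2.0
    total = 0.0
    for j in range(terms):
        weight = math.exp(-half + j * math.log(half) - math.lgamma(j + 1))
        total += weight * betainc(df1 / 2 + j, df2 / 2, y)
    return total


def f_crit(alpha, df1, df2):
    """Upper-alpha critical value of the central F, by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while ncf_cdf(hi, df1, df2, 0.0) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if ncf_cdf(mid, df1, df2, 0.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return hi


def anova_power(t, r, delta, sigma2, alpha=0.05):
    """Power under the assumed minimum-power pattern (two means delta apart)."""
    df1, df2 = t - 1, t * (r - 1)
    lam = r * (delta ** 2 / 2.0) / sigma2
    return 1.0 - ncf_cdf(f_crit(alpha, df1, df2), df1, df2, lam)


def reps_needed(t, delta, sigma2, target=0.80, alpha=0.05, r_max=100):
    """Smallest r (at least 2) whose power reaches the target, or None if none does."""
    for r in range(2, r_max + 1):
        if anova_power(t, r, delta, sigma2, alpha) >= target:
            return r
    return None


power_r2 = anova_power(t=4, r=2, delta=10.0, sigma2=17.46)
r_needed = reps_needed(t=4, delta=10.0, sigma2=17.46)
print(round(power_r2, 4), r_needed)
```

Searching upward over \(r\) works because power increases with replication: each added rep raises \(\lambda\) and the error degrees of freedom at the same time.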